TITLE OF THE INVENTION
Root Cause Correlation in Connectionless Networks.

FIELD OF THE INVENTION 
The present invention relates to computer network technology in general, and in
particular to correlation of network errors to root causes in connectionless networks.

BACKGROUND OF THE INVENTION

Connectionless computer networks, such as Internet Protocol (IP) networks, are
typically formed by connecting multiple routers to each other using either point-to-point
connections or the Data Link Layer of the International Standard Organization's Open
System Interconnect (ISO/OSI) network model, commonly referred to as "layer 2." One of
the main features of a connectionless network is the ability of a network node, such as a PC,
to connect directly to any of the routers and send/receive packetized data to/from any other
network node connected to any other router. To accomplish this each node is typically
uniquely identified by a unique network address, known in IP networks as an IP address.

Routing of packets in a connectionless computer network is now described by
way of example with reference to Fig. 1. When a node A sends a packet to a node B, A
must specify the address of B as the destination address of the packet. The first router R1
that accepts the packet forwards the packet to the next router R2 on the path to B,
whereupon R2 forwards the packet to the next router R3 on the path to B, and so on.
When the packet reaches the router to which B is directly connected, it is forwarded to B.
It may thus be seen that, for any given destination address to which a packet is addressed,
every router in the network should know the packet's next "hop," i.e., to which next router
the packet is to be forwarded. Each router typically maintains this information in a routing
table which contains a mapping between addresses or address groups, such as IP subnets,
and the next hop for packets destined for these addresses.
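The hop-by-hop forwarding described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table contents, the convention that a next hop equal to the destination marks a directly attached node, and the names ROUTING_TABLES and trace_route are hypothetical and do not come from the specification.

```python
# Hypothetical routing tables: router name -> {destination node: next hop}.
# Assumption: a next hop equal to the destination means the node is
# directly attached to that router.
ROUTING_TABLES = {
    "R1": {"A": "A", "B": "R2"},
    "R2": {"A": "R1", "B": "R3"},
    "R3": {"A": "R2", "B": "B"},
}


def trace_route(first_router, destination, tables, max_hops=16):
    """Return the list of routers a packet visits on its way to destination.

    Raises LookupError when a router has no entry for the destination,
    which corresponds to a "no route to destination" condition.
    """
    path = []
    current = first_router
    for _ in range(max_hops):
        path.append(current)
        next_hop = tables[current].get(destination)
        if next_hop is None:
            raise LookupError(f"no route to destination {destination} at {current}")
        if next_hop == destination:
            return path  # the last router delivers directly to the node
        current = next_hop
    raise RuntimeError("hop limit exceeded (possible routing loop)")


print(trace_route("R1", "B", ROUTING_TABLES))  # ['R1', 'R2', 'R3']
```

Each router consults only its own table, so no router needs to know the full path, only the next hop.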

When a link connecting two routers in a network fails, a partitioning of the
network may occur. Thus in Fig. 1, if the link between R1 and R2 fails, nodes A and C can
still communicate with each other but not with nodes B and D, and vice versa. Each router
will typically automatically detect this situation and update its routing table accordingly,
such as by eliminating entries whose next hop is unreachable. However, nodes in one
partition may still try to send packets to nodes in the other partition. When this occurs, a
"no route to destination" error is typically generated and logged by the first router to detect
the problem, which then reports the problem to the network management system (NMS).
The NMS must then decide what action to take, such as tracing the error to its root cause.
In large networks where there may be many active communication sessions between nodes
at one time, a single link failure event might cause numerous "no route to destination"
notifications to be generated in every router that receives packets destined for the other
partition and reported to the NMS. Thus, where the existence of a link failure is already
known to the NMS, it would be advantageous to know whether or not a routing error is
caused by the link failure, as well as which nodes might be affected by the link failure,
obviating the need for the NMS to take action that it would normally take.
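As a small illustration of how a single link failure splits a network into partitions, the following Python sketch computes which elements remain reachable once the link between R1 and R2 is removed. The link list is an assumption (nodes A and C attached to R1, nodes B and D attached to R2), since the figure itself is not reproduced in this text, and the helper name reachable is hypothetical.

```python
from collections import deque

# Assumed topology: A and C attach to R1, B and D attach to R2.
LINKS = [("A", "R1"), ("C", "R1"), ("R1", "R2"), ("R2", "B"), ("R2", "D")]


def reachable(start, links):
    """Return the set of elements reachable from start over undirected links."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first search over the surviving links
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Simulate the failure of the R1-R2 link.
remaining = [link for link in LINKS if link != ("R1", "R2")]
print(sorted(reachable("A", remaining)))  # ['A', 'C', 'R1']
print(sorted(reachable("B", remaining)))  # ['B', 'D', 'R2']
```

Any packet whose source lies in one of the two sets and whose destination lies in the other can no longer be delivered, which is the situation that produces the error notifications discussed above.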


SUMMARY OF THE INVENTION 
The present invention provides for the correlation of routing errors to link
failures in a connectionless network.

In one aspect of the present invention a method is provided for correlating
routing errors to link failures in a network, the method including detecting a link failure
between a first and a second router in a network, associating a first node address indicated
in a first routing table of the first router with a first partition of the network, where a next
hop of a packet destined for the first node address is the second router, associating a second
node address indicated in a second routing table of the second router with a second
partition of the network, where a next hop of a packet destined for the second node address
is the first router, and correlating an error notification resulting from the failed delivery of a
packet with the link failure where a source address of the packet corresponds to the first
node address and a destination address of the packet corresponds to the second node
address.

In another aspect of the present invention any of the steps are performed with
respect to a connectionless network.

In another aspect of the present invention the correlating step includes
correlating a "no route to destination" error.

In another aspect of the present invention the associating steps comprise
constructing a connectivity table.

In another aspect of the present invention the method further includes
suppressing the error.

In another aspect of the present invention any of the steps are performed in a
distributed network management system by at least one software agent associated with
either of the routers.

In another aspect of the present invention the method further includes notifying
at least one other agent in the network of the associations of the nodes to the partitions,
where the other agent is not associated with either of the routers.

In another aspect of the present invention a method is provided for correlating
routing errors to link failures in a network, the method including identifying a path between
a first node and a second node in a network, detecting a link failure in the network,
determining if the link failure lay along the path, and correlating an error notification
resulting from the failed delivery of a packet with the link failure where a source address of
the packet corresponds to an address of the first node, where a destination address of the
packet corresponds to an address of the second node, and where the link failure lay along
the path.

In another aspect of the present invention the identifying step includes 
identifying either of a most commonly used route and a most heavily used route between the 
nodes in accordance with a predefined measure of use. 

In another aspect of the present invention any of the steps are performed with
respect to a connectionless network.

In another aspect of the present invention the correlating step includes 
correlating a "no route to destination" error. 

In another aspect of the present invention the method further includes
suppressing the error.

In another aspect of the present invention any of the steps are performed in a
distributed network management system by a software agent associated with either of the
routers.

In another aspect of the present invention a system is provided for correlating
routing errors to link failures in a network, the system including means for detecting a link
failure between a first and a second router in a network, means for associating a first node
address indicated in a first routing table of the first router with a first partition of the
network, where a next hop of a packet destined for the first node address is the second
router, means for associating a second node address indicated in a second routing table of
the second router with a second partition of the network, where a next hop of a packet
destined for the second node address is the first router, and means for correlating an error
notification resulting from the failed delivery of a packet with the link failure where a source
address of the packet corresponds to the first node address and a destination address of the
packet corresponds to the second node address.

In another aspect of the present invention any of the means are operative with
respect to a connectionless network.

In another aspect of the present invention the means for correlating is operative
to correlate a "no route to destination" error.

In another aspect of the present invention the means for associating are
operative to construct a connectivity table.

In another aspect of the present invention the system further includes means for
suppressing the error.

In another aspect of the present invention any of the means are
operative in a distributed network management system including at least one software agent
associated with either of the routers.

In another aspect of the present invention the system further includes means for
notifying at least one other agent in the network of the associations of the nodes to the
partitions, where the other agent is not associated with either of the routers.

In another aspect of the present invention a system is provided for correlating
routing errors to link failures in a network, the system including means for identifying a path
between a first node and a second node in a network, means for detecting a link failure in
the network, means for determining if the link failure lay along the path, and means for
correlating an error notification resulting from the failed delivery of a packet with the link
failure where a source address of the packet corresponds to an address of the first node,
where a destination address of the packet corresponds to an address of the second node,
and where the link failure lay along the path.

In another aspect of the present invention the means for identifying is operative
to identify either of a most commonly used route and a most heavily used route between the
nodes in accordance with a predefined measure of use.

In another aspect of the present invention any of the means are operative with
respect to a connectionless network.

In another aspect of the present invention the means for correlating is
operative to correlate a "no route to destination" error.

In another aspect of the present invention the system further includes means for
suppressing the error.

In another aspect of the present invention any of the means are operative in a
distributed network management system including a software agent associated with either
of the routers.

BRIEF DESCRIPTION OF THE DRAWINGS
The present invention will be understood and appreciated more fully from the
following detailed description taken in conjunction with the appended drawings in which:

Fig. 1 is a simplified pictorial illustration of a network framework, useful in
understanding the present invention;

Fig. 2 is a simplified pictorial illustration of a network framework supporting
error correlation, constructed and operative in accordance with a preferred embodiment of
the present invention;

Fig. 3 is a simplified flowchart illustration of a method of correlation of routing
errors to link failures in a connectionless network, operative in accordance with a preferred
embodiment of the present invention;

Fig. 4 is a simplified flowchart illustration of a method of correlation of routing
errors to link failures in a connectionless network supported by a distributed network
management system, operative in accordance with a preferred embodiment of the present
invention; and

Fig. 5 is a simplified flowchart illustration of a method of identifying nodes that
may be affected by link failures in a connectionless network, operative in accordance with a
preferred embodiment of the present invention.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

Reference is now made to Fig. 2, which is a simplified pictorial illustration of a
network framework supporting error correlation, constructed and operative in accordance
with a preferred embodiment of the present invention, and additionally to Fig. 3, which is a
simplified flowchart illustration of a method of correlation of routing errors to link failures
in a connectionless network, operative in accordance with a preferred embodiment of the
present invention. In Fig. 2 a link 200 between two routers R1 and R2 is shown as having
failed, as designated by an 'x' through link 200. Prior to the failure of link 200, a routing
table 202 of router R1 shows that the next hop for packets destined for B and D is R2,
while a routing table 204 of router R2 shows that the next hop for packets destined for A
and C is R1. It may be seen that two partitions 206 and 208 (shown in dashed lines) are
thus created in that nodes A and C cannot communicate with nodes B and D via link 200,
and vice versa.

A network management system (NMS) 210 preferably maintains copies of
routing tables 202 and 204. Having detected a link failure between R1 and R2, NMS 210
may create a connectivity table 212 indicating which nodes are in each of partitions 206 and
208. Since NMS 210 knows that R2 is inaccessible to R1 via link 200, NMS 210 may
associate with partition 206 those node addresses in its copy of routing table 202 whose
next hop is R2. Likewise, NMS 210 may associate with partition 208 those node addresses
in routing table 204 whose next hop is R1. Should NMS 210 receive a "no route to
destination" error notification from a network router together with the source and
destination addresses of the packet that could not be delivered, NMS 210 may look up the
source and destination addresses in connectivity table 212 to determine whether they are
from different partitions. If the source and destination addresses are from different
partitions, then the "no route to destination" error notification may have resulted from an
attempt to send the packet across failed link 200. Thus, the error notification may be
correlated with the link failure that is already known to NMS 210, and the error may be
suppressed and need not be investigated further. Otherwise, the error notification should
not be correlated with the link failure and may be investigated or otherwise acted upon by
NMS 210.
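The construction of the connectivity table and the subsequent lookup can be sketched as follows. This is a minimal Python illustration, not the implementation of the invention: the function names, the dictionary representation of the routing tables, and the entries for directly attached nodes are assumptions. The partition labels 206 and 208 simply reuse the reference numerals from the text, and the association follows the text literally (addresses in table 202 whose next hop is R2 are labeled 206, addresses in table 204 whose next hop is R1 are labeled 208).

```python
def build_connectivity_table(table_r1, table_r2, r1="R1", r2="R2"):
    """Map each node address to a partition label after the r1-r2 link fails."""
    connectivity = {}
    for address, next_hop in table_r1.items():
        if next_hop == r2:  # r1 reached this address through r2
            connectivity[address] = 206
    for address, next_hop in table_r2.items():
        if next_hop == r1:  # r2 reached this address through r1
            connectivity[address] = 208
    return connectivity


def correlates_with_link_failure(source, destination, connectivity):
    """True when both addresses are known and lie in different partitions."""
    src = connectivity.get(source)
    dst = connectivity.get(destination)
    return src is not None and dst is not None and src != dst


# Routing tables 202 and 204 as described for Fig. 2. The entries for
# directly attached nodes (next hop equal to the node) are assumptions.
table_202 = {"A": "A", "C": "C", "B": "R2", "D": "R2"}
table_204 = {"B": "B", "D": "D", "A": "R1", "C": "R1"}
table_212 = build_connectivity_table(table_202, table_204)

print(correlates_with_link_failure("A", "B", table_212))  # True
print(correlates_with_link_failure("A", "C", table_212))  # False
```

An error for a packet from A to B would thus be attributed to the known link failure and could be suppressed, while an error for a packet from A to C would still warrant investigation.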

Reference is now made to Fig. 4, which is a simplified flowchart illustration of a
method of correlation of routing errors to link failures in a connectionless network
supported by a distributed network management system, operative in accordance with a
preferred embodiment of the present invention. In Fig. 4 the present invention is
implemented in a distributed network management system, such as is described in U.S.
Patent Application No. 09/799,637 and published as Published Application No.
20010039577, where every router has an associated software agent which continuously
monitors the state of the router and its links. The agents monitoring R1 and R2 would thus
detect the failure of link 200 and then communicate with each other to create connectivity
table 212, which may then be provided to the agents of all other routers in the network.
Thus, when any router Rx encounters a "no route to destination" error, its associated agent
looks up the source and destination addresses in connectivity table 212 to determine
whether they are from different partitions, and action may be taken or the error notification
ignored as described above.
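A rough sketch of the distributed variant is given below in Python. The Agent class, its method names, the partition labels "first" and "second", and the third router R3 with its table are all hypothetical and serve only to illustrate the flow: the two agents adjacent to the failed link build the connectivity table, every agent receives a copy, and any agent can then classify a "no route to destination" error locally.

```python
class Agent:
    """Hypothetical software agent attached to one router."""

    def __init__(self, router, routing_table):
        self.router = router
        self.routing_table = routing_table  # destination address -> next hop
        self.connectivity = {}              # address -> partition label

    def addresses_beyond(self, peer_router):
        """Addresses this router used to reach through peer_router."""
        return {addr for addr, hop in self.routing_table.items() if hop == peer_router}

    def handle_no_route(self, source, destination):
        """Return 'suppress' if the error matches the known link failure."""
        src = self.connectivity.get(source)
        dst = self.connectivity.get(destination)
        if src is not None and dst is not None and src != dst:
            return "suppress"
        return "investigate"


def on_link_failure(agent_a, agent_b, all_agents):
    """The two agents adjacent to the failed link build and share the table."""
    table = {addr: "first" for addr in agent_a.addresses_beyond(agent_b.router)}
    table.update({addr: "second" for addr in agent_b.addresses_beyond(agent_a.router)})
    for agent in all_agents:
        agent.connectivity = dict(table)  # each agent keeps its own copy
    return table


r1 = Agent("R1", {"B": "R2", "D": "R2", "A": "A", "C": "C"})
r2 = Agent("R2", {"A": "R1", "C": "R1", "B": "R3", "D": "R3"})
r3 = Agent("R3", {"A": "R2", "C": "R2", "B": "B", "D": "D"})
on_link_failure(r1, r2, [r1, r2, r3])

print(r3.handle_no_route("B", "A"))  # suppress
print(r3.handle_no_route("B", "D"))  # investigate
```

Distributing a copy of the table lets an agent such as the one on R3, which is not adjacent to the failed link, reach the same conclusion without consulting a central NMS.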

Reference is now made to Fig. 5, which is a simplified flowchart illustration of a
method of identifying nodes that may be affected by link failures in a connectionless
network, operative in accordance with a preferred embodiment of the present invention. In
Fig. 5 a list of virtual paths in a network is maintained, where each virtual path represents
the traversal of the links, routers, and other network elements comprising the most
commonly used and/or most heavily used routes between network nodes, as determined
using any predefined measure of use. The virtual path list may be maintained centrally, such
as by NMS 210, or in a distributed manner, such as by one or more agents in a distributed
network management system. The virtual path list may be created using any conventional
technique, such as by identifying common access patterns in router access lists, analyzing
network failure alarms (e.g., packet lost, no route, etc.) to determine traffic flow, and
determining network tomography from traffic counter patterns. When a failed link is
detected, each virtual path may be checked using any known technique to determine if it is
broken and, if so, which nodes and other network elements along the path are affected.
Thereafter, should a "no route to destination" error be encountered where the source
address of the packet being sent belongs to the node at one end of a virtual path known to
have a failed link, and the packet's destination address belongs to the node at the other end
of the virtual path, the error may be correlated to the failed link and action may be taken or
suppressed as described hereinabove.
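The virtual path check can be illustrated with the following Python sketch. The path list, the representation of a path as an ordered list of elements, and the function names are assumptions made for illustration; the text leaves both the way the list is built and the way a path is checked open to any conventional technique.

```python
# Hypothetical virtual path list: (end node, end node) -> ordered elements.
VIRTUAL_PATHS = {
    ("A", "B"): ["A", "R1", "R2", "R3", "B"],
    ("A", "C"): ["A", "R1", "C"],
}


def broken_paths(virtual_paths, failed_link):
    """Return the endpoint pairs whose path traverses the failed link."""
    failed = frozenset(failed_link)
    broken = set()
    for endpoints, elements in virtual_paths.items():
        hops = zip(elements, elements[1:])  # consecutive element pairs
        if any(frozenset(hop) == failed for hop in hops):
            broken.add(frozenset(endpoints))
    return broken


def correlates(source, destination, broken):
    """True when source and destination are the two ends of a broken path."""
    return frozenset((source, destination)) in broken


broken = broken_paths(VIRTUAL_PATHS, ("R1", "R2"))
print(correlates("B", "A", broken))  # True
print(correlates("A", "C", broken))  # False
```

Using unordered pairs for both links and endpoints means the check does not depend on the direction in which the packet was travelling.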

It is appreciated that one or more of the steps of any of the methods described
herein may be omitted or carried out in a different order than that shown, without departing
from the true spirit and scope of the invention.

While the methods and apparatus disclosed herein may or may not have been
described with reference to specific hardware or software, it is appreciated that the methods
and apparatus described herein may be readily implemented in hardware or software using
conventional techniques.

While the present invention has been described with reference to one or more
specific embodiments, the description is intended to be illustrative of the invention as a
whole and is not to be construed as limiting the invention to the embodiments shown. It is
appreciated that various modifications may occur to those skilled in the art that, while not
specifically shown herein, are nevertheless within the true spirit and scope of the invention.

CLAIMS

What is claimed is: 

1. A method for correlating routing errors to link failures in a network, the method
comprising:

detecting a link failure between a first and a second router in a network;

associating a first node address indicated in a first routing table of said first
router with a first partition of said network, wherein a next hop of a packet destined for said
first node address is said second router;

associating a second node address indicated in a second routing table of said
second router with a second partition of said network, wherein a next hop of a packet
destined for said second node address is said first router; and

correlating an error notification resulting from the failed delivery of a packet
with said link failure where a source address of said packet corresponds to said first node
address and a destination address of said packet corresponds to said second node address.

2. A method according to claim 1 wherein any of said steps are performed with 
respect to a connectionless network. 

3. A method according to claim 1 wherein said correlating step comprises
correlating a "no route to destination" error.

4. A method according to claim 1 wherein said associating steps comprise 
constructing a connectivity table. 

5. A method according to claim 1 and further comprising suppressing said error.

6. A method according to claim 1 wherein any of said steps are performed in a
distributed network management system by at least one software agent associated with
either of said routers.

7. A method according to claim 6 and further comprising notifying at least one
other agent in said network of said associations of said nodes to said partitions, wherein
said other agent is not associated with either of said routers.

8. A method for correlating routing errors to link failures in a network, the method
comprising:

identifying a path between a first node and a second node in a network;

detecting a link failure in said network;

determining if said link failure lay along said path; and

correlating an error notification resulting from the failed delivery of a packet
with said link failure where a source address of said packet corresponds to an address of
said first node, where a destination address of said packet corresponds to an address of said
second node, and where said link failure lay along said path.

9. A method according to claim 8 wherein said identifying step comprises
identifying either of a most commonly used route and a most heavily used route between
said nodes in accordance with a predefined measure of use.

10. A method according to claim 8 wherein any of said steps are performed with
respect to a connectionless network.

11. A method according to claim 8 wherein said correlating step comprises
correlating a "no route to destination" error.



12. A method according to claim 8 and further comprising suppressing said error.

13. A method according to claim 8 wherein any of said steps are performed in a
distributed network management system by a software agent associated with either of said
routers.

14. A system for correlating routing errors to link failures in a network, the system
comprising:

means for detecting a link failure between a first and a second router in a
network;

means for associating a first node address indicated in a first routing table of
said first router with a first partition of said network, wherein a next hop of a packet
destined for said first node address is said second router;

means for associating a second node address indicated in a second routing table
of said second router with a second partition of said network, wherein a next hop of a
packet destined for said second node address is said first router; and

means for correlating an error notification resulting from the failed delivery of a
packet with said link failure where a source address of said packet corresponds to said first
node address and a destination address of said packet corresponds to said second node
address.


15. A system according to claim 14 wherein any of said means are operative with
respect to a connectionless network.

16. A system according to claim 14 wherein said means for correlating is operative
to correlate a "no route to destination" error.

17. A system according to claim 14 wherein said means for associating are
operative to construct a connectivity table.

18. A system according to claim 14 and further comprising means for suppressing
said error.

19. A system according to claim 14 wherein any of said means are operative in a
distributed network management system comprising at least one software agent associated
with either of said routers.

20. A system according to claim 19 and further comprising means for notifying at
least one other agent in said network of said associations of said nodes to said partitions,
wherein said other agent is not associated with either of said routers.

21. A system for correlating routing errors to link failures in a network, the system
comprising:

means for identifying a path between a first node and a second node in a
network;

means for detecting a link failure in said network;

means for determining if said link failure lay along said path; and

means for correlating an error notification resulting from the failed delivery of a
packet with said link failure where a source address of said packet corresponds to an
address of said first node, where a destination address of said packet corresponds to an
address of said second node, and where said link failure lay along said path.




22. A system according to claim 21 wherein said means for identifying is operative
to identify either of a most commonly used route and a most heavily used route between
said nodes in accordance with a predefined measure of use.

23. A system according to claim 21 wherein any of said means are operative with
respect to a connectionless network.

24. A system according to claim 21 wherein said means for correlating is
operative to correlate a "no route to destination" error.

25. A system according to claim 21 and further comprising means for suppressing
said error.



26. A system according to claim 21 wherein any of said means are operative in a
distributed network management system comprising a software agent associated with either
of said routers.